AITopics | Pelagonia Statistical Region

Dialectal and Low-Resource Machine Translation for Aromanian

Jerpelea, Alexandru-Iulius, Rădoi, Alina, Nisioi, Sergiu

arXiv.org Artificial Intelligence, Jan-7-2025

This paper presents the process of building a neural machine translation system with support for English, Romanian, and Aromanian - an endangered Eastern Romance language. The primary contribution of this research is twofold: (1) the creation of the most extensive Aromanian-Romanian parallel corpus to date, consisting of 79,000 sentence pairs, and (2) the development and comparative analysis of several machine translation models optimized for Aromanian. To accomplish this, we introduce a suite of auxiliary tools, including a language-agnostic sentence embedding model for text mining and automated evaluation, complemented by a diacritics conversion system for different writing standards. This research brings contributions to both computational linguistics and language preservation efforts by establishing essential resources for a historically under-resourced language. All datasets, trained models, and associated tools are public: https://huggingface.co/aronlp and https://arotranslate.com
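The abstract mentions using a sentence embedding model for text mining. As a rough illustration only, the sketch below pairs each source sentence with its most similar target sentence by cosine similarity and keeps pairs above a cutoff. The vectors are toy values, and `mine_pairs` and the `threshold` of 0.8 are hypothetical choices for this example, not the authors' pipeline or settings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """Return (src_index, tgt_index, similarity) for each source sentence whose
    nearest target sentence reaches the (hypothetical) similarity threshold."""
    if not tgt_vecs:
        return []
    pairs = []
    for i, u in enumerate(src_vecs):
        # Nearest target by cosine similarity.
        best_j, best_sim = max(
            ((j, cosine(u, v)) for j, v in enumerate(tgt_vecs)),
            key=lambda item: item[1],
        )
        if best_sim >= threshold:
            pairs.append((i, best_j, best_sim))
    return pairs


# Toy embeddings standing in for real sentence vectors.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[0.0, 2.0], [3.0, 0.1], [-1.0, -1.0]]
mined = mine_pairs(source, target)
```

In this toy run the third source vector has no sufficiently close target, so it is left unpaired; real mining setups often add margin-based scoring on top of the raw similarity.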

aromanian, computational linguistic, translation, (14 more...)

arXiv.org Artificial Intelligence

2410.17728

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
(12 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Artificial Intelligence in Cybersecurity: Building Resilient Cyber Diplomacy Frameworks

Stoltz, Michael

arXiv.org Artificial Intelligence, Nov-17-2024

This paper explores how automation and artificial intelligence (AI) are transforming U.S. cyber diplomacy. Leveraging these technologies helps the U.S. manage the complexity and urgency of cyber diplomacy, improving decision-making, efficiency, and security. As global interconnectivity grows, cyber diplomacy, the management of national interests in the digital space, has become vital. The ability of AI and automation to quickly process vast data volumes enables timely responses to cyber threats and opportunities. This paper underscores the strategic integration of these tools to maintain U.S. competitive advantage and secure national interests. Automation enhances diplomatic communication and data processing, freeing diplomats to focus on strategic decisions. AI supports predictive analytics and real-time decision-making, offering critical insights and proactive measures during high-stakes engagements. Case studies show AI's effectiveness in monitoring cyber activities and managing international cyber policy. Challenges such as ethical concerns, security vulnerabilities, and reliance on technology are also addressed, emphasizing human oversight and strong governance frameworks. Ensuring proper ethical guidelines and cybersecurity measures allows the U.S. to harness the benefits of automation and AI while mitigating risks. By adopting these technologies, U.S. cyber diplomacy can become more proactive and effective, navigating the evolving digital landscape with greater agility.

data mining, diplomacy, real time system, (14 more...)

arXiv.org Artificial Intelligence

2411.13585

Country:

Europe > Romania (0.05)
Europe > United Kingdom (0.05)
Asia > Middle East > Iran (0.05)
(2 more...)

Genre:

Research Report (0.84)
Overview (0.69)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Architecture > Real Time Systems (0.88)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.49)

Hybrid Approach to Identify Druglikeness Leading Compounds against COVID-19 3CL Protease

Aqeel, Imra, Majid, Abdul

arXiv.org Artificial Intelligence, Aug-24-2022

SARS-CoV-2 is a positive-sense, single-stranded RNA virus that had caused the deaths of more than 6.3 million people as of June 2022. Moreover, by disrupting global supply chains through lockdowns, the virus has indirectly caused devastating damage to the global economy. It is vital to design and develop drugs for this virus and its variants. In this paper, we developed an in-silico hybrid framework to repurpose existing therapeutic agents and find drug-like bioactive molecules that could cure COVID-19. We applied the Lipinski rules to molecules retrieved from the ChEMBL database and found 133 drug-like bioactive molecules against SARS coronavirus 3CL Protease. Based on standard IC50, the dataset was divided into three classes: active, inactive, and intermediate. Our comparative analysis demonstrated that the proposed Extra Tree Regressor (ETR) based QSAR model predicts the bioactivity of chemical compounds better than Gradient Boosting, XGBoost, Support Vector, Decision Tree, and Random Forest based regressor models. ADMET analysis was carried out to identify thirteen bioactive molecules with ChEMBL IDs 187460, 190743, 222234, 222628, 222735, 222769, 222840, 222893, 225515, 358279, 363535, 365134 and 426898. These molecules are highly suitable drug candidates for SARS-CoV-2 3CL Protease. In the next step, the efficacy of the bioactive molecules was computed in terms of binding affinity using molecular docking, and six bioactive molecules were shortlisted, with ChEMBL IDs 187460, 222769, 225515, 358279, 363535, and 365134. These molecules can be suitable drug candidates for SARS-CoV-2. It is anticipated that pharmacologists and drug manufacturers will further investigate these six molecules and can adopt these promising compounds in their downstream drug development stages.
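The Lipinski filtering step mentioned in the abstract can be illustrated with a minimal sketch. It checks the four standard rule-of-five limits (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Descriptors` class, the `max_violations` parameter, and the example values are hypothetical; the abstract does not say how many violations the authors tolerated or how the descriptors were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits the molecule exceeds."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def is_drug_like(d: Descriptors, max_violations: int = 0) -> bool:
    """True if the molecule exceeds at most `max_violations` limits.
    The default of 0 is an assumption; some workflows allow one violation."""
    return lipinski_violations(d) <= max_violations


# Made-up descriptor values, not taken from ChEMBL.
small_molecule = Descriptors(mol_weight=350.4, logp=2.1, h_donors=2, h_acceptors=5)
bulky_molecule = Descriptors(mol_weight=720.0, logp=6.3, h_donors=2, h_acceptors=5)
```

Here `is_drug_like(small_molecule)` holds, while `bulky_molecule` exceeds both the weight and logP limits and is rejected even when one violation is allowed.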

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/ph15111333

2208.06362

Country:

North America > United States (0.68)
Europe > Czechia > South Moravian Region > Brno (0.04)
Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Chi-square-based scoring function for categorization of MEDLINE citations

Kastrin, Andrej, Peterlin, Borut, Hristovski, Dimitar

arXiv.org Machine Learning, Jun-5-2010

Objectives: Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood of MEDLINE citations containing genetically relevant topics. Methods: Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task. We compared frequencies of MeSH descriptors between the two corpora by applying the chi-square test. A MeSH descriptor was considered to be a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest score given to those citations containing MeSH descriptors typical for the genetic domain. Results: Validation was done on a set of 734 manually annotated MEDLINE citations. It achieved predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated the method by comparing it to three machine learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significant, results showed that our chi-square scoring performs as well as the compared machine learning algorithms. Conclusions: We suggest that the chi-square scoring is an effective solution to help categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for the gene symbol disambiguation process.
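The scoring idea described in the Methods can be sketched as follows. Each descriptor gets a Pearson chi-square statistic from a 2x2 table (corpus by presence of the descriptor), signed positive when its relative frequency is higher in the genetic corpus. Summing the signed statistics into a citation score is an assumption made for this sketch, since the abstract does not give the exact scoring formula; the function names and the toy corpora are likewise hypothetical.

```python
from collections import Counter


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def descriptor_weights(genetic_docs, nongenetic_docs):
    """Signed chi-square weight per descriptor.
    Positive means the descriptor is relatively more frequent in the genetic corpus."""
    n_gen, n_non = len(genetic_docs), len(nongenetic_docs)
    if n_gen == 0 or n_non == 0:
        raise ValueError("both corpora must be non-empty")
    # Document frequencies: each descriptor counted once per citation.
    gen_counts = Counter(d for doc in genetic_docs for d in set(doc))
    non_counts = Counter(d for doc in nongenetic_docs for d in set(doc))
    weights = {}
    for desc in set(gen_counts) | set(non_counts):
        a = gen_counts[desc]   # genetic citations with the descriptor
        c = non_counts[desc]   # nongenetic citations with the descriptor
        chi2 = chi_square_2x2(a, n_gen - a, c, n_non - c)
        sign = 1.0 if a / n_gen > c / n_non else -1.0
        weights[desc] = sign * chi2
    return weights


def score_citation(descriptors, weights):
    """Sum of signed weights over a citation's descriptors (assumed scoring rule)."""
    return sum(weights.get(d, 0.0) for d in set(descriptors))


# Toy corpora of MeSH-like descriptor lists, invented for illustration.
genetic = [["Genes", "Mutation"], ["Genes", "Humans"], ["Mutation", "Humans"]]
nongenetic = [["Surgery", "Humans"], ["Surgery"], ["Humans", "Nursing"]]
weights = descriptor_weights(genetic, nongenetic)
```

With these toy corpora, a citation tagged `["Genes", "Mutation"]` scores higher than one tagged `["Surgery", "Nursing"]`, and a descriptor that is equally common in both corpora, such as `"Humans"`, contributes nothing.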

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1006.1029

Country:

North America > United States (1.00)
Europe > North Macedonia > Pelagonia Statistical Region > Bitola Municipality > Bitola (0.25)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
